
Re: glimpseHTTP and httpd access control



>>>>> "bb" == Ben Beecher <beecher@columbia.edu> writes:

bb> We didn't install the mfs and getfile components of aglimpse
bb> because of the security problem you mentioned.

Ditto for us at St. Olaf College.

bb> The simplest solution for us is to deny access to any directories
bb> that contain .htaccess files.

Or that have a parent directory containing a .htaccess file.

>> riddle@rice.edu (Prentiss Riddle) wrote:

>> (1) Modify "cgi-bin/aglimpse" so that it eliminates the use of
>> "cgi-bin/mfs" and instead refers the user to the URL of the
>> original document.

That's the solution I've used.  It means that you don't get the
word(s) you searched for highlighted for you, but it's a small price
to pay.
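
(For the curious: the change mostly amounts to turning each matched
file's path back into a URL.  A tiny sketch of that mapping in Perl --
the document root and hostname below are made-up stand-ins for
whatever your server actually uses:

	# Hypothetical helper: map a matched file's path to a plain URL,
	# so aglimpse can link to the document instead of going via mfs.
	$docroot = "/usr/local/etc/httpd/htdocs";     # assumption
	$server  = "http://www.your.host";            # assumption

	sub file2url {
	    local($file) = @_;
	    local($url) = $file;
	    $url =~ s/^$docroot//;    # strip the filesystem prefix
	    return $server . $url;    # leaves an ordinary URL
	}

That way the httpd itself serves the document and applies whatever
access control it normally would.)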

>> (2) Modify "wwwlib/getfile" so that it parses and obeys access
>> control restrictions within ".htaccess" files, based on the IP
>> address of the client.  This still might run afoul of other access
>> control restrictions (e.g. per-user access) in ".htaccess" files.

It also wouldn't take care of restrictions found in the server's
access.conf file.
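
(To give a feel for why (2) is only a partial answer, here is a very
rough sketch of the kind of check "getfile" would have to grow.  This
is not glimpseHTTP code -- the function and its .htaccess handling are
my own guesswork, and real NCSA access files have <Limit> sections,
"order", "require" and so on, all of which this ignores:

	# Guesswork, not glimpseHTTP code: true if any "deny from" line
	# in $dir/.htaccess matches the client address.  Ignores <Limit>
	# sections, "order", "allow", "require user", etc.
	sub ip_denied {
	    local($dir, $addr) = @_;
	    local($denied) = 0;
	    open(HT, "$dir/.htaccess") || return 0;   # no file, no denial
	    while (<HT>) {
	        $denied = 1
	            if (/^\s*deny\s+from\s+(\S+)/i &&
	                ($1 eq "all" || index($addr, $1) == 0));
	    }
	    close(HT);
	    return $denied;
	}

	# e.g.  &ip_denied($dir, $ENV{'REMOTE_ADDR'}) && &forbidden();
	# (&forbidden being whatever error page getfile would emit)

Per-user restrictions, and anything living in access.conf, slip right
past a check like that -- which is the point being made above.)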

(I wrote my own CGI scripts for use with "glimpse" and am not very
familiar with the "glimpseHTTP" distribution.  Hence, I may not know
what I'm talking about.)  If "aglimpse" were modified to create URLs
in the form "/cgi-bin/mfs/url_goes_here", and mfs used HTTP to
retrieve the document in question, then you wouldn't have a problem.

An older (?) Perl package that makes retrieving documents via HTTP
easy is "url.pl" (along with "ftplib.pl"), originally by:

# Oscar Nierstrasz 26/8/93 oscar@cui.unige.ch

Mfs would then get the text of the file via:

	$html = &url'get("http://foo.com/what/ever.htm");

... and then mangle the HTML all it wants.
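
(Filling in the rest of that thought -- and this is purely a guess at
how the top of such an mfs might look, assuming the target URL rides
in on PATH_INFO as in the "/cgi-bin/mfs/url_goes_here" form above:

	require 'url.pl';

	$doc = $ENV{'PATH_INFO'};
	$doc =~ s,^/+,,;          # "/http://host/doc" -> "http://host/doc"
	$html = &url'get($doc);   # the httpd applies its own access
	                          # control to this fetch

	print "Content-type: text/html\n\n";
	# ... highlight the search terms in $html here, then ...
	print $html;

Whether the "//" inside the embedded URL survives the trip through the
server's PATH_INFO handling is something you'd want to check first.)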

>> However, both of these approaches still share a drawback: even the
>> Glimpse search report, which provides only filenames and the
>> excerpted lines which match a search, could be used to probe
>> restricted areas within a WWW data tree.  Even the leak of a
>> fragment of a restricted document might be considered a serious
>> security problem.

Heh.  That's a good point.  Relying on glimpseindex's own recursive
descent is a recipe for that kind of trouble.  You could, I suppose,
use the .glimpse_exclude mechanism to help.
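
(If memory serves, .glimpse_exclude is just a list of patterns, one
per line, that glimpseindex skips -- something along these lines, with
the paths obviously invented:

	/home/www/htdocs/private
	/home/www/htdocs/staff-only
	*.gif
	*.au

But then you have to remember to keep it in sync with every .htaccess
in the tree, which is exactly the sort of thing that gets forgotten.)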

Given the oddness of St. Olaf's web server & its directory hierarchy,
I wanted the recursive descent to include directories pointed to by
symlinks.  So I wrote a Perl script which would do the descent,
spitting out a list of files which are safe to index.  If a directory
contains a .htaccess file, it and its descendants are ignored.
(This doesn't take care of the Restrictions In "access.conf" Problem,
but I don't have anything complicated there.  YMMV.)  Symlinks are
followed under certain circumstances (and an effort is made to avoid
symlinks which create cycles in the descent!).
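
(The script itself is nothing special, but a stripped-down sketch of
the idea -- with an invented starting directory and a much blunter
symlink policy than the real thing -- looks about like this:

	#!/usr/bin/perl
	# Sketch: print a list of files safe to hand to "glimpseindex -F".
	# Any directory holding a .htaccess is pruned, along with
	# everything under it.  Symlinked directories are followed, but
	# dev/inode pairs are remembered so a symlink loop can't send the
	# descent around in circles.

	%seen = ();
	&descend("/home/www/htdocs");          # invented document root

	sub descend {
	    local($dir) = @_;
	    local($dev, $ino, @names, $name, $path);

	    return if (-f "$dir/.htaccess");   # restricted: prune here
	    ($dev, $ino) = stat($dir);
	    return if ($seen{"$dev:$ino"}++);  # been here before (cycle)

	    opendir(D, $dir) || return;
	    @names = readdir(D);
	    closedir(D);

	    foreach $name (@names) {
	        next if ($name eq '.' || $name eq '..');
	        $path = "$dir/$name";
	        if (-d $path)    { &descend($path); }
	        elsif (-f $path) { print "$path\n"; }
	    }
	}

The real script is pickier about which symlinks it follows.)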

That list of files is then fed to "glimpseindex -F".

-Scott
---
Scott E. Lystig Fritchie, UNIX Systems Manager       Co-founder:
Academic Computing Center, St. Olaf College          Twin Cities Free-Net
1510 St. Olaf Ave., Northfield, MN  55057            Organizing Committee
fritchie@stolaf.edu ... 507/646.3407                 (Minneapolis/St. Paul, MN)
"Activism is the killer app for the net." -- Steven Cherry <stc@panix.com>
"... the files from their disk's womb untimely ripp'd." -- Wm. UNIX Shakespeare

